A Discriminative Approach to Japanese Zero Anaphora Resolution with Large-scale Lexicalized Case Frames
Abstract
We present a discriminative model for Japanese zero anaphora resolution that simultaneously determines an appropriate case frame for a given predicate and its predicate-argument structure. Our model is based on a log-linear framework and exploits lexical features obtained from a large raw corpus, as well as non-lexical features obtained from a relatively small annotated corpus. We report the results of zero anaphora resolution on Web text and demonstrate the effectiveness of our approach. We also investigate the relative importance of each feature for resolving zero anaphora in Web text.
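As a rough illustration of the log-linear idea named in the abstract, the sketch below scores each candidate analysis (a case frame paired with an argument assignment) by a weighted sum of its features and normalizes the scores into probabilities. The feature names, weights, case frame labels, and entities are hypothetical placeholders chosen for the example; they are not taken from the paper, and the paper's actual features and training procedure are not shown here.

```python
import math

def log_linear_probs(candidates, weights):
    """Return P(candidate), proportional to exp(w . f(candidate))."""
    scores = [
        sum(weights.get(name, 0.0) * value for name, value in feats.items())
        for _, feats in candidates
    ]
    # Subtract the maximum score before exponentiating for numerical stability.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical candidates: ((case frame id, argument assignment), feature values).
candidates = [
    (("cf1", {"ga": "entity_A"}), {"lexical_fit": 2.0, "salience": 1.0}),
    (("cf2", {"ga": "entity_B"}), {"lexical_fit": 0.5, "salience": 0.2}),
]
# Hypothetical weights; in a real system these would be learned from annotated data.
weights = {"lexical_fit": 1.0, "salience": 0.5}

probs = log_linear_probs(candidates, weights)
# Pick the jointly best (case frame, argument assignment) pair.
best = max(zip(candidates, probs), key=lambda pair: pair[1])[0][0]
```

Because the case frame and the argument assignment form a single candidate, choosing the highest-probability candidate selects both at once, which is one simple way to picture the joint decision the abstract describes.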
Similar resources
A Fully-Lexicalized Probabilistic Model for Japanese Zero Anaphora Resolution
This paper presents a probabilistic model for Japanese zero anaphora resolution. First, this model recognizes discourse entities and links all mentions to them. Zero pronouns are then detected by case structure analysis based on automatically constructed case frames. Their appropriate antecedents are selected from the entities with high salience scores, based on the case frames and several pref...
Automatic Construction of Nominal Case Frames and its Application to Indirect Anaphora Resolution
This paper proposes a method to automatically construct Japanese nominal case frames. The point of our method is the integrated use of a dictionary and example phrases from large corpora. To examine the practical usefulness of the constructed nominal case frames, we also built a system of indirect anaphora resolution based on the case frames. The constructed case frames were evaluated by hand, ...
A Large Scale Database of Strongly-related Events in Japanese
Authors: Tomohide Shibata, Shotaro Kohama and Sadao Kurohashi, Graduate School of Informatics, Kyoto University, Yoshida-honmachi, Sakyo-ku, Kyoto, 606-8501, Japan ({shibata, kohama, kuro}@nlp.ist.i.kyoto-u.ac.jp). Abstract: The knowledge about the relation between events is quite useful for coreference resolution, anaphora resolution, and several NLP applications such as dialogue system. This paper presents ...
The Effect of Corpus Size on Case Frame Acquisition for Discourse Analysis
This paper reports the effect of corpus size on case frame acquisition for discourse analysis in Japanese. For this study, we collected a Japanese corpus consisting of up to 100 billion words, and constructed case frames from corpora of six different sizes. Then, we applied these case frames to syntactic and case structure analysis, and zero anaphora resolution. We obtained better results by us...
Improving Japanese Zero Pronoun Resolution by Global Word Sense Disambiguation
This paper proposes unsupervised word sense disambiguation based on automatically constructed case frames and its incorporation into our zero pronoun resolution system. The word sense disambiguation is applied to verbs and nouns. We consider that case frames define verb senses and semantic features in a thesaurus define noun senses, respectively, and perform sense disambiguation by selecting th...